1. As a city planner, I want to determine the best potential locations for new green walls in the city of Melbourne. What locations would be most beneficial?
Green walls have been proven to have a moderating effect on multiple environmental factors. In addition to this they provide wildlife corridors and visual amenity. I would like to visualise a range of relevant factors such as pollution, temperatures and tree coverage in order to determine the most impactful locations for the installation of new green walls around Melbourne.
At the end of this use case you will:
Issue being addressed
It is a priority of the City of Melbourne to create a healthier natural environment and adapt the city to climate change. These goals are outlined in the council's visions and goals.
One component of their plan to tackle these issues is Melbourne's Urban Forest Strategy. The strategy aims to improve the overall ecological health of the city by addressing canopy cover, plant diversity, vegetation health and ecology.
Green walls are vertically built structures intentionally covered by vegetation. They provide an extra option to assist in meeting several of these goals, enabling greening of areas with little available space. Several considerations, requirements and benefits associated with green walls can be explored further in the city's Growing Green Guide. The datasets explored in this analysis provide insight into specific locations throughout the city where green walls are feasible and can have the greatest impact.
Datasets overview
All of the datasets used in this analysis come from the City of Melbourne's open data portal.
The first dataset to be analysed is that of the city's Microclimate sensors, which have been placed in several locations around the city. These provide a host of relevant environmental data including airborne particulate matter and ambient temperature. We will access this information (Microclimate Sensor Readings) and pair it with its location dataset (Microclimate Sensor Locations) to map the results and determine where green walls will have the greatest climatic effect.
We then perform the same process with the Pedestrian Counting System - Monthly (counts per hour) dataset and its pair Pedestrian Counting System - Sensor Locations to find the most heavily trafficked areas, where extra visual amenity will be seen by the most people.
The analysis concludes by plotting the city's tree canopy data and comparing it with butterfly and insect surveys. These show the direct relationship between canopy coverage and ecological diversity, which in turn highlights the need for green walls in areas with the least canopy.
To begin we will import the required libraries and datasets to perform our exploratory analysis and visualisation of the datasets.
The following are core packages required for this exercise:
If you attempt to run this first cell and get a 'module not found' error, you may need to install the package on your system. Try running `pip install <package name>` to install the missing package. If this doesn't work, you may need to try Google!
#Depending on how you are running this notebook you may need some of the following packages.
# !pip install sodapy
# !pip install geopandas
# !pip install rtree
# !pip install pygeos
#Issue with using folium through Jupyter notebook was resolved by downgrading markupsafe as follows:
# !pip install markupsafe==2.0.1
#File manipulation
import os
from datetime import datetime
from sodapy import Socrata
#Data manipulation
import numpy as np
import pandas as pd
import geopandas as gpd
import seaborn as sns
from shapely.geometry import Polygon
#Visualisation
import matplotlib.pyplot as plt
import branca.colormap as cm
import folium
from folium.plugins import MarkerCluster
#Details for City of Melbourne data on Socrata website.
apptoken = os.environ.get("SODAPY_APPTOKEN") # App token created on the Socrata website; note os.environ.get takes the name of an environment variable holding the token, not the token itself
domain = "data.melbourne.vic.gov.au"
client = Socrata(domain, apptoken) # Open Dataset connection
WARNING:root:Requests made without an app_token will be subject to strict throttling limits.
Retrieve all data from PM10 24 hour average sensors, with erroneous values removed, from within the last year.
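The request below filters server-side with a SoQL `$where` expression. A minimal sketch of assembling such a clause from a sensor type and a cut-off time (the `soql_where` helper is hypothetical; it is plain string formatting):

```python
from datetime import datetime

def soql_where(sensor_type: str, since: datetime) -> str:
    """Build a SoQL $where clause filtering by reading type and start time.

    Hypothetical helper: sodapy passes the resulting string to the API verbatim.
    """
    return f"type='{sensor_type}' AND local_time > '{since.isoformat(timespec='seconds')}'"

clause = soql_where('PM10-EPA-1h-NOPK-EPA-24h', datetime(2021, 4, 11, 20, 0, 0))
print(clause)
```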
# Retrieve hourly rolling 24 hour averages for PM10 since April 2021.
dataresource = client.get('u4vh-84j8', where="type='PM10-EPA-1h-NOPK-EPA-24h' AND local_time > '2021-04-11T20:00:00'", limit=75000)
pm10_dataset = pd.DataFrame.from_dict(dataresource)
pm10_dataset.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 58589 entries, 0 to 58588
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 58589 non-null object
1 site_id 58589 non-null object
2 gateway_hub_id 58589 non-null object
3 sensor_id 58589 non-null object
4 value 58571 non-null object
5 local_time 58589 non-null object
6 type 58589 non-null object
7 units 58589 non-null object
dtypes: object(8)
memory usage: 3.6+ MB
# Convert to correct data types
pm10_dataset[['id', 'site_id']] = pm10_dataset[['id', 'site_id']].astype(int)
pm10_dataset[['value']] = pm10_dataset[['value']].astype(float)
pm10_dataset['local_time'] = pd.to_datetime(pm10_dataset['local_time'] )
pm10_dataset = pm10_dataset.convert_dtypes()
pm10_dataset.dtypes
id Int32
site_id Int32
gateway_hub_id string
sensor_id string
value Float64
local_time datetime64[ns]
type string
units string
dtype: object
With datatypes sorted, view the total entries from each site and their mean PM10 recordings.
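The count-and-mean aggregation used in the next cell follows a standard pandas pattern; a toy sketch on made-up readings:

```python
import pandas as pd

# Toy readings: two sites with a few PM10 values each (numbers invented).
readings = pd.DataFrame({
    'site_id': [1001, 1001, 1001, 1005, 1005],
    'value':   [20.0, 30.0, 40.0, 10.0, 30.0],
})

# Per-site count and mean, with prefixed column names.
site_stats = (readings.groupby('site_id')['value']
              .agg(['count', 'mean'])
              .add_prefix('value_')
              .reset_index())
print(site_stats)
```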
groupbyfields = ['site_id']
aggregatebyfields = ["count", "mean"]
# Create new DataFrame with value counts and mean values for each site.
pm10SiteCount = pd.DataFrame(pm10_dataset.groupby(groupbyfields, as_index=False)['value'].agg(aggregatebyfields)
.add_prefix('value_'))
pm10SiteCount = pm10SiteCount.reset_index()
pm10SiteCount
| | site_id | value_count | value_mean |
|---|---|---|---|
| 0 | 1001 | 1005 | 28.424478 |
| 1 | 1005 | 1406 | 28.789687 |
| 2 | 1006 | 869 | 25.733487 |
| 3 | 1007 | 8457 | 24.733877 |
| 4 | 1008 | 665 | 25.643759 |
| 5 | 1009 | 7829 | 28.572142 |
| 6 | 1010 | 7617 | 24.704805 |
| 7 | 1011 | 6244 | 23.735842 |
| 8 | 1012 | 5439 | 27.334216 |
| 9 | 1013 | 5251 | 26.425214 |
| 10 | 1014 | 5630 | 28.297247 |
| 11 | 1015 | 4343 | 24.607944 |
| 12 | 1016 | 3816 | 29.164911 |
Comparing the data to the standards
With the annual average national standard at 25 µg/m³ (Victorian standard 20 µg/m³), we can see that 9 of the 13 sites have exceeded the national standard and all are exceeding the Victorian one. However, there's a varying amount of data collected for each site, so we should check the first and last recorded dates.
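Counting how many sites exceed each standard is a one-line boolean sum; a sketch on invented per-site means:

```python
import pandas as pd

# Made-up per-site annual mean PM10 readings (µg/m³), for illustration only.
site_means = pd.Series({1001: 28.4, 1007: 24.7, 1011: 23.7, 1016: 29.2})

NATIONAL, VICTORIAN = 25.0, 20.0
over_national = (site_means > NATIONAL).sum()
over_victorian = (site_means > VICTORIAN).sum()
print(over_national, over_victorian)  # 2 sites over the national standard, 4 over the Victorian
```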
groupbyfields = ['site_id']
aggregatebyfields = {'local_time': ["min", "max"]}
# Create new DataFrame highlighting start and end dates.
recordingDates= pd.DataFrame(pm10_dataset.groupby(groupbyfields, as_index=False).agg(aggregatebyfields))
recordingDates
| | site_id | local_time (min) | local_time (max) |
|---|---|---|---|
| 0 | 1001 | 2021-04-11 21:00:00 | 2021-06-12 23:00:00 |
| 1 | 1005 | 2021-04-11 21:00:00 | 2021-06-13 00:00:00 |
| 2 | 1006 | 2021-05-20 00:00:00 | 2021-06-28 00:00:00 |
| 3 | 1007 | 2021-05-20 00:00:00 | 2022-05-22 04:00:00 |
| 4 | 1008 | 2021-06-15 11:00:00 | 2021-09-20 18:00:00 |
| 5 | 1009 | 2021-06-15 09:00:00 | 2022-05-23 19:00:00 |
| 6 | 1010 | 2021-06-29 00:00:00 | 2022-05-23 19:00:00 |
| 7 | 1011 | 2021-08-25 19:00:00 | 2022-05-23 19:00:00 |
| 8 | 1012 | 2021-09-20 19:00:00 | 2022-05-23 19:00:00 |
| 9 | 1013 | 2021-09-20 19:00:00 | 2022-05-10 21:00:00 |
| 10 | 1014 | 2021-09-20 19:00:00 | 2022-05-23 19:00:00 |
| 11 | 1015 | 2021-09-20 19:00:00 | 2022-05-23 12:00:00 |
| 12 | 1016 | 2021-09-20 19:00:00 | 2022-05-23 19:00:00 |
As evident above, no site actually has a full 12 months' worth of data, making annual averages difficult to rely upon without looking further back for a better overall view.
Let's obtain all available data, look at the averages by month, and see if any clear trends indicate a likely reduction of these averages within a yearly timeframe.
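Monthly averaging groups each reading into its calendar month. A toy sketch of the pattern using `dt.to_period` (equivalent in effect to the `pd.Grouper(key='local_time', freq='M')` approach used in the cells that follow); the values are synthetic:

```python
import pandas as pd

# Hourly readings for one site spanning April and May 2021 (synthetic values).
times = pd.to_datetime('2021-04-01') + pd.to_timedelta(range(1464), unit='h')
df = pd.DataFrame({
    'site_id': 1001,
    'local_time': times,
    'value': [20.0] * 720 + [30.0] * 744,   # April averages 20, May averages 30
})

# Group by site and calendar month, then average.
monthly = (df.groupby(['site_id', df['local_time'].dt.to_period('M')])['value']
           .mean()
           .reset_index())
print(monthly)
```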
# Retrieve all hourly rolling 24 hour averages for PM10.
dataresource = client.get('u4vh-84j8', where="type='PM10-EPA-1h-NOPK-EPA-24h'", limit=100000)
all_pm10_dataset = pd.DataFrame.from_dict(dataresource)
all_pm10_dataset.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100000 entries, 0 to 99999
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 100000 non-null object
1 site_id 100000 non-null object
2 gateway_hub_id 100000 non-null object
3 sensor_id 100000 non-null object
4 value 99979 non-null object
5 local_time 100000 non-null object
6 type 100000 non-null object
7 units 100000 non-null object
dtypes: object(8)
memory usage: 6.1+ MB
# Convert to correct data types
all_pm10_dataset[['id', 'site_id']] = all_pm10_dataset[['id', 'site_id']].astype(int)
all_pm10_dataset[['value']] = all_pm10_dataset[['value']].astype(float)
all_pm10_dataset['local_time'] = pd.to_datetime(all_pm10_dataset['local_time'] )
all_pm10_dataset = all_pm10_dataset.convert_dtypes()
all_pm10_dataset.dtypes
id Int32
site_id Int32
gateway_hub_id string
sensor_id string
value Float64
local_time datetime64[ns]
type string
units string
dtype: object
groupbyfields = ['site_id', pd.Grouper(key='local_time', freq='M')]
aggregatebyfields = {'value':['mean']}
# Create new DataFrame grouping the averages of each month since recording first began.
Pm10AllSiteCount = pd.DataFrame(all_pm10_dataset.groupby(groupbyfields).agg(aggregatebyfields))
Pm10AllSiteCount = Pm10AllSiteCount.reset_index()
# Plot this in a timeline graph
sns.set_theme()
fig, ax = plt.subplots(1,1);
Pm10AllSiteCount.groupby(['site_id']).plot(x='local_time', y='value', ax=ax, figsize=(15,7.5))
plt.legend([v[0] for v in Pm10AllSiteCount.groupby('site_id')['site_id']], loc='center left', bbox_to_anchor=(1, 0.5)).set_title('Sensor ID#')
plt.xlabel("Month")
plt.ylabel("Average PM$_{10}$ Reading")
plt.title("Average monthly PM$_{10}$ Reading (µg/m\u00b3) at Sites Across Melbourne - Nov 2019 to May 2022", y=1.1, fontsize=15)
# Add lines representing the national and victorian annual standards.
ax.axhline(y = 25, color = 'r', label='National Annual Standard')
ax.axhline(y = 20, color = 'b', label='Victorian Annual Standard')
<matplotlib.lines.Line2D at 0x1593488da00>
Interpreting the graph
We can see all sites generally follow the same trend; however, based on current and past data there's no reason to expect any significant drop that would reach Victoria's annual standard. Of the sites still recording data, sites 1009, 1012, 1016 and 1014 show the highest averages.
Looking purely at the last year of data
groupbyfields = ['site_id', pd.Grouper(key='local_time', freq='M')]
aggregatebyfields = {'value':['mean']}
# Create new DataFrame grouping the averages of each month over the last year.
Pm10AllSiteCount = pd.DataFrame(pm10_dataset.groupby(groupbyfields).agg(aggregatebyfields))
Pm10AllSiteCount = Pm10AllSiteCount.reset_index()
# Plot this in a timeline graph
fig, ax = plt.subplots(1,1);
Pm10AllSiteCount.groupby(['site_id']).plot(x='local_time', y='value', ax=ax, figsize=(15,7.5))
plt.legend([v[0] for v in Pm10AllSiteCount.groupby('site_id')['site_id']], loc='center left', bbox_to_anchor=(1, 0.5)).set_title('Sensor ID#')
plt.xlabel("Month")
plt.ylabel("Average PM$_{10}$ Reading")
plt.title("Average monthly PM$_{10}$ Reading (µg/m\u00b3) at Sites Across Melbourne - Apr 2021 to May 2022", y=1.1, fontsize=15)
# Add lines representing the national and victorian annual standards.
ax.axhline(y = 25, color = 'r', label='National Annual Standard')
ax.axhline(y = 20, color = 'b', label='Victorian Annual Standard')
<matplotlib.lines.Line2D at 0x159347b7cd0>
Mapping sensor data
Let's now look at merging our data with the sensor location dataset so we can visualise the results.
Sensor location dataset: https://data.melbourne.vic.gov.au/Environment/Microclimate-Sensor-Locations/irqv-hjr4
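The merge keys the aggregated readings to the location records on `site_id`; a toy sketch (values invented) showing that the default inner join quietly drops sites missing from either side:

```python
import pandas as pd

# Aggregated readings keyed by site_id (toy numbers).
site_stats = pd.DataFrame({'site_id': [1007, 1009], 'value_mean': [24.7, 28.6]})

# Location records, including one site with no readings.
locations = pd.DataFrame({
    'site_id': [1007, 1009, 1016],
    'latitude': [-37.8224, -37.8169, -37.8128],
    'longitude': [144.9518, 144.9657, 144.9606],
})

# Inner merge (the default) keeps only sites present in both frames.
merged = pd.merge(site_stats, locations, on='site_id')
print(merged)
```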
#Loading Melbourne Microclimate Sensor Location data
dataresource = client.get('irqv-hjr4')
# Retrieve data from Microclimate Sensor Locations dataset.
location_dataset = pd.DataFrame.from_dict(dataresource)
location_dataset.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16 entries, 0 to 15
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 site_id 16 non-null object
1 gateway_hub_id 16 non-null object
2 site_status 16 non-null object
3 start_reading 16 non-null object
4 end_reading 7 non-null object
5 longitude 16 non-null object
6 latitude 16 non-null object
7 location 16 non-null object
8 last_data 9 non-null object
dtypes: object(9)
memory usage: 1.2+ KB
# Set site_id to int for merging
location_dataset[['site_id']] = location_dataset[['site_id']].astype(int)
location_dataset[['longitude', 'latitude']] = location_dataset[['longitude', 'latitude']].astype(float)
# Merging sensor location data with reading data.
pm10 = pd.merge(pm10SiteCount, location_dataset, on='site_id')
# Drop unused columns and convert remaining to string datatypes.
pm10 = pm10.drop(['gateway_hub_id', 'end_reading', 'last_data', 'location'], axis=1)
pm10 = pm10.convert_dtypes()
pm10.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 13 entries, 0 to 12
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 site_id 13 non-null Int64
1 value_count 13 non-null Int64
2 value_mean 13 non-null Float64
3 site_status 13 non-null string
4 start_reading 13 non-null string
5 longitude 13 non-null Float64
6 latitude 13 non-null Float64
dtypes: Float64(3), Int64(2), string(2)
memory usage: 897.0 bytes
# Drop retired sensors, remaining sensors running since at least September 2021.
pm10 = pm10[pm10['site_status'] != 'R']
pm10
| | site_id | value_count | value_mean | site_status | start_reading | longitude | latitude |
|---|---|---|---|---|---|---|---|
| 3 | 1007 | 8457 | 24.733877 | C | 2021-05-20T00:00:00.000 | 144.951841 | -37.822447 |
| 5 | 1009 | 7829 | 28.572142 | C | 2021-06-14T00:00:00.000 | 144.965711 | -37.816854 |
| 6 | 1010 | 7617 | 24.704805 | C | 2021-06-29T00:00:00.000 | 144.952228 | -37.822487 |
| 7 | 1011 | 6244 | 23.735842 | C | 2021-07-07T00:00:00.000 | 144.952228 | -37.822209 |
| 8 | 1012 | 5439 | 27.334216 | C | 2021-09-17T00:00:00.000 | 144.970006 | -37.81332 |
| 9 | 1013 | 5251 | 26.425214 | C | 2021-09-17T00:00:00.000 | 144.956395 | -37.811931 |
| 10 | 1014 | 5630 | 28.297247 | C | 2021-09-17T00:00:00.000 | 144.967228 | -37.817487 |
| 11 | 1015 | 4343 | 24.607944 | C | 2021-09-17T00:00:00.000 | 144.972506 | -37.810265 |
| 12 | 1016 | 3816 | 29.164911 | C | 2021-09-17T00:00:00.000 | 144.960562 | -37.812765 |
Relevant Dataset
Pedestrian Counting System - Monthly (counts per hour). Retrieve all data recorded since the 11th of April 2021 to help identify high pedestrian traffic areas.
dataset_id = 'b2ak-trbp' # Pedestrian Counting Dataset
dataresource = client.get(dataset_id, where="Date_Time > '2021-04-11T20:00:00'", limit=7500000) # Select a year's worth of recordings
pedest_count = pd.DataFrame.from_dict(dataresource)
pedest_count.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 632002 entries, 0 to 632001
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 632002 non-null object
1 date_time 632002 non-null object
2 year 632002 non-null object
3 month 632002 non-null object
4 mdate 632002 non-null object
5 day 632002 non-null object
6 time 632002 non-null object
7 sensor_id 632002 non-null object
8 sensor_name 632002 non-null object
9 hourly_counts 632002 non-null object
dtypes: object(10)
memory usage: 48.2+ MB
# Drop columns not being used
pedest_count = pedest_count.drop(['id', 'year', 'month', 'day', 'mdate', 'time'], axis=1)
# Convert to correct data types
pedest_count[[ 'sensor_id', 'hourly_counts']] = pedest_count[['sensor_id', 'hourly_counts']].astype(int)
pedest_count['date_time'] = pd.to_datetime(pedest_count['date_time'] )
pedest_count = pedest_count.convert_dtypes()
pedest_count.dtypes
date_time datetime64[ns]
sensor_id Int32
sensor_name string
hourly_counts Int32
dtype: object
Calculating hourly averages
With columns and datatypes sorted, let's focus on the average hourly pedestrian count for each sensor.
groupbyfields = ['sensor_id']
aggregatebyfields = ["mean"]
# Create new DataFrame with mean values for each site.
pedest_mean = pd.DataFrame(pedest_count.groupby(groupbyfields)['hourly_counts'].agg(aggregatebyfields).astype(int)
.add_prefix('hourly_counts_'))
pedest_mean = pedest_mean.reset_index()
pedest_mean
| | sensor_id | hourly_counts_mean |
|---|---|---|
| 0 | 1 | 619 |
| 1 | 2 | 367 |
| 2 | 3 | 556 |
| 3 | 4 | 778 |
| 4 | 5 | 674 |
| ... | ... | ... |
| 70 | 78 | 43 |
| 71 | 79 | 368 |
| 72 | 84 | 800 |
| 73 | 85 | 77 |
| 74 | 86 | 54 |
75 rows × 2 columns
Pairing datasets
Let's now obtain the paired dataset containing each sensor's location.
dataset_id = 'h57g-5234' # Pedestrian Counting Sensor Locations Dataset
dataresource = client.get(dataset_id)
pedest_loc = pd.DataFrame.from_dict(dataresource)
pedest_loc.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 85 entries, 0 to 84
Data columns (total 11 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 sensor_id 85 non-null object
1 sensor_description 85 non-null object
2 sensor_name 85 non-null object
3 installation_date 85 non-null object
4 status 85 non-null object
5 latitude 85 non-null object
6 longitude 85 non-null object
7 location 85 non-null object
8 direction_1 65 non-null object
9 direction_2 65 non-null object
10 note 10 non-null object
dtypes: object(11)
memory usage: 7.4+ KB
# Drop columns not likely to be used
pedest_loc = pedest_loc.drop(['sensor_description', 'direction_1', 'direction_2', 'location', 'note'], axis=1)
# Convert to correct data types
pedest_loc[[ 'sensor_id']] = pedest_loc[['sensor_id']].astype(int)
pedest_loc[['latitude', 'longitude']] = pedest_loc[['latitude', 'longitude']].astype(float)
pedest_loc['installation_date'] = pd.to_datetime(pedest_loc['installation_date'])
pedest_loc = pedest_loc.convert_dtypes()
pedest_loc.dtypes
sensor_id Int32
sensor_name string
installation_date datetime64[ns]
status string
latitude Float64
longitude Float64
dtype: object
# Merging sensor location data with counting data.
pedest = pd.merge(pedest_mean, pedest_loc, on='sensor_id')
Observations
With our data merged, we can see below that several sensors (IDs 79, 84, 85 and 86) were installed during the period we're examining. We should take note of this, as a shorter recording window could skew their averages; the remaining sensors cover the full timeframe.
pedest
| | sensor_id | hourly_counts_mean | sensor_name | installation_date | status | latitude | longitude |
|---|---|---|---|---|---|---|---|
| 0 | 1 | 619 | Bou292_T | 2009-03-24 | A | -37.813481 | 144.965159 |
| 1 | 2 | 367 | Bou283_T | 2009-03-30 | A | -37.813794 | 144.965173 |
| 2 | 3 | 556 | Swa295_T | 2009-03-25 | A | -37.811002 | 144.964301 |
| 3 | 4 | 778 | Swa123_T | 2009-03-23 | A | -37.814867 | 144.966094 |
| 4 | 5 | 674 | PriNW_T | 2009-03-26 | A | -37.818729 | 144.967883 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 70 | 78 | 43 | HarEsB_T | 2021-03-30 | A | -37.814703 | 144.944657 |
| 71 | 79 | 368 | FliSS_T | 2021-09-10 | A | -37.817927 | 144.966173 |
| 72 | 84 | 800 | ElFi_T | 2021-12-03 | A | -37.817967 | 144.96504 |
| 73 | 85 | 77 | 488Mac_T | 2021-12-21 | A | -37.794311 | 144.92974 |
| 74 | 86 | 54 | 574Qub_T | 2021-12-21 | A | -37.803087 | 144.949087 |
75 rows × 7 columns
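Sensors installed after the window opened can be flagged with a simple date filter; a toy sketch (values invented, mirroring the shape of the table above):

```python
import pandas as pd

# Toy merged sensor table with installation dates.
pedest = pd.DataFrame({
    'sensor_id': [1, 79, 84],
    'installation_date': pd.to_datetime(['2009-03-24', '2021-09-10', '2021-12-03']),
})

# Sensors installed after the analysis window opened have incomplete averages.
window_start = pd.Timestamp('2021-04-11')
partial = pedest[pedest['installation_date'] > window_start]
print(partial['sensor_id'].tolist())
```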
Mapping Insect Density
This analysis combines three City of Melbourne open datasets to examine the relationship between tree canopy and the number of different insect species present at given locations. This relationship provides further justification for the green wall locations chosen, as they are drawn from areas of the city with low tree canopy coverage.
First we load and format the relevant datasets:
#Butterfly_biodiversity_survey_2017
bFlyDataResource = client.get('kmtd-nvqr', where="sloc >= 0", limit=75000) #Note SoQL query is designed to retrieve all records.
bflydata = pd.DataFrame.from_dict(bFlyDataResource)
#Insect_Records_in_the_City_of_Melbourne__from_Little_Things_that_Run_the_City_
insectDataResource = client.get('47ai-9yup', where="Taxa = 'Insect'", limit=75000) #Note SoQL query is designed to retrieve all records.
insectdata = pd.DataFrame.from_dict(insectDataResource)
#Importing tree canopy data as geoJSON
from urllib.request import urlopen
import json
geoJSON_Id = 'tq3k-d7n7'
GeoJSONURL = 'https://'+domain+'/api/geospatial/'+geoJSON_Id+'?method=export&format=GeoJSON'
with urlopen(GeoJSONURL) as response:
block = json.load(response)
canopies = gpd.GeoDataFrame.from_features(block["features"])
canopies.crs = "EPSG:4326" #Setting coordinate reference system (CRS) for the format of the data (WGS84)
#Viewing the butterfly dataset
bflydata
| | site | sloc | walk | date | time | vegwalktime | vegspecies | vegfamily | lat | lon | ... | tabe | brow | csem | aand | jvil | paur | ogyr | gmac | datetime | location |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Womens Peace Gardens | 2 | 1 | 2017-01-29T00:00:00.000 | 11:07 | 0.3069 | Schinus molle | Anacardiaceae | -37.7913 | 144.9241 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2017-01-29T11:07:00.000 | {'latitude': '-37.7913', 'longitude': '144.9241'} |
| 1 | Womens Peace Gardens | 2 | 1 | 2017-02-26T00:00:00.000 | 11:42 | 1.3128 | Schinus molle | Anacardiaceae | -37.7912 | 144.9244 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2017-02-26T11:42:00.000 | {'latitude': '-37.7912', 'longitude': '144.9244'} |
| 2 | Murchinson Square | 1 | 1 | 2017-01-26T00:00:00.000 | 10:10 | 0.2306 | Asteraceae 1 | Asteraceae | -37.8001 | 144.9729 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2017-01-26T10:10:00.000 | {'latitude': '-37.8001', 'longitude': '144.9729'} |
| 3 | Fitzroy-Treasury Gardens | 3 | 1 | 2017-03-09T00:00:00.000 | 10:21 | 0.1818 | Begonia sp. | Begoniaceae | -37.8143 | 144.9814 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2017-03-09T10:21:00.000 | {'latitude': '-37.8143', 'longitude': '144.9814'} |
| 4 | Westgate Park | 4 | 3 | 2017-03-03T00:00:00.000 | 17:27 | 1.3333 | Xerochrysum viscosum | Asteraceae | -37.8303 | 144.9102 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2017-03-03T17:27:00.000 | {'latitude': '-37.8303', 'longitude': '144.9102'} |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4055 | Westgate Park | 1 | 1 | 2017-02-05T00:00:00.000 | 09:39 | 1.0397 | Bursaria spinosa | Pittosporaceae | -37.832 | 144.9061 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2017-02-05T09:39:00.000 | {'latitude': '-37.832', 'longitude': '144.9061'} |
| 4056 | Argyle Square | 1 | 2 | 2017-02-11T00:00:00.000 | 11:01 | 0.0525 | Salvia sp. | Lamiaceae | -37.8024 | 144.9663 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2017-02-11T11:01:00.000 | {'latitude': '-37.8024', 'longitude': '144.9663'} |
| 4057 | Westgate Park | 4 | 1 | 2017-02-02T00:00:00.000 | 11:05 | 1.5948 | Xerochrysum viscosum | Asteraceae | -37.8316 | 144.9093 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2017-02-02T11:05:00.000 | {'latitude': '-37.8316', 'longitude': '144.9093'} |
| 4058 | Argyle Square | 1 | 2 | 2017-02-11T00:00:00.000 | 11:04 | 0.0639 | Salvia sp. | Lamiaceae | -37.8023 | 144.9664 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2017-02-11T11:04:00.000 | {'latitude': '-37.8023', 'longitude': '144.9664'} |
| 4059 | Carlton Gardens South | 3 | 1 | 2017-01-30T00:00:00.000 | 12:42 | 1.4437 | Asteraceae 1 | Asteraceae | -37.8044 | 144.9704 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2017-01-30T12:42:00.000 | {'latitude': '-37.8044', 'longitude': '144.9704'} |
4060 rows × 42 columns
#Aggregating the butterfly data as count of individual species by monitoring location.
bflyCounts = bflydata.groupby(["site"])["site"].count().reset_index(name="Butterfly Species Count")
bflyCounts.insert(0, 'id', range(1, 1 + len(bflyCounts)))
bflyCounts
| | id | site | Butterfly Species Count |
|---|---|---|---|
| 0 | 1 | Argyle Square | 256 |
| 1 | 2 | Canning/Neill St Reserve | 40 |
| 2 | 3 | Carlton Gardens South | 592 |
| 3 | 4 | Fitzroy-Treasury Gardens | 572 |
| 4 | 5 | Gardiner Reserve | 28 |
| 5 | 6 | Garrard Street Reserve | 48 |
| 6 | 7 | Lincoln Square | 124 |
| 7 | 8 | Murchinson Square | 52 |
| 8 | 9 | Pleasance Gardens | 56 |
| 9 | 10 | Princes Park | 224 |
| 10 | 11 | Royal Park | 1152 |
| 11 | 12 | State Library of Victoria | 84 |
| 12 | 13 | University Square | 24 |
| 13 | 14 | Westgate Park | 588 |
| 14 | 15 | Womens Peace Gardens | 220 |
#Viewing the insect dataset
insectdata
| | taxa | kingdom | phylum | class | order | family | genus | species | identification_notes | common_name |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Insect | ANIMALIA | ARTHROPODA | INSECTA | HEMIPTERA | TRIOZIDAE | Aacanthocnema | dobsoni | Royal Park | NaN |
| 1 | Insect | ANIMALIA | ARTHROPODA | INSECTA | HEMIPTERA | TRIOZIDAE | Aacanthocnema | dobsoni | Royal Park | NaN |
| 2 | Insect | ANIMALIA | ARTHROPODA | INSECTA | HEMIPTERA | TRIOZIDAE | Aacanthocnema | dobsoni | Royal Park | NaN |
| 3 | Insect | ANIMALIA | ARTHROPODA | INSECTA | HEMIPTERA | TRIOZIDAE | Aacanthocnema | dobsoni | Westgate Park | NaN |
| 4 | Insect | ANIMALIA | ARTHROPODA | INSECTA | COLEOPTERA | CURCULIONIDAE | Acalonoma | NaN | Royal Park | Acalonoma sp. 1 (pusillum group) |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3690 | Insect | ANIMALIA | ARTHROPODA | INSECTA | HYMENOPTERA | TIPHIIDAE | NaN | NaN | Royal Park | Tiphiidae 2 |
| 3691 | Insect | ANIMALIA | ARTHROPODA | INSECTA | HYMENOPTERA | TIPHIIDAE | NaN | NaN | Royal Park | Tiphiidae 2 |
| 3692 | Insect | ANIMALIA | ARTHROPODA | INSECTA | HYMENOPTERA | TORYMIDAE | NaN | NaN | Canning/Neil Street Reserve | Torymidae 1 |
| 3693 | Insect | ANIMALIA | ARTHROPODA | INSECTA | HYMENOPTERA | TORYMIDAE | NaN | NaN | Garrard Street Reserve | Torymidae 1 |
| 3694 | Insect | ANIMALIA | ARTHROPODA | INSECTA | HYMENOPTERA | TORYMIDAE | NaN | NaN | Princes Park | Torymidae 1 |
3695 rows × 10 columns
#Aggregating the insect data as count of individual species by monitoring location.
insectCounts = insectdata.groupby(["identification_notes"])["identification_notes"].count().reset_index(name="Miscellaneous Insect Count")
#Creating id index
insectCounts.insert(0, 'id', range(1, 1 + len(insectCounts)))
#Renaming the monitoring location column to something more intuitive.
insectCounts = insectCounts.rename({'identification_notes': 'Location'}, axis=1)
insectCounts
| | id | Location | Miscellaneous Insect Count |
|---|---|---|---|
| 0 | 1 | Argyle Square | 105 |
| 1 | 2 | Canning/Neil Street Reserve | 15 |
| 2 | 3 | Carlton Gardens South | 273 |
| 3 | 4 | Fitzroy-Treasury Gardens | 557 |
| 4 | 5 | Gardiner Reserve | 51 |
| 5 | 6 | Garrard Street Reserve | 33 |
| 6 | 7 | Lincoln Square | 79 |
| 7 | 8 | Murchinson Square | 22 |
| 8 | 9 | Pleasance Gardens | 76 |
| 9 | 10 | Princes Park | 446 |
| 10 | 11 | Royal Park | 1249 |
| 11 | 12 | State Library of Victoria | 105 |
| 12 | 13 | University Square | 43 |
| 13 | 14 | Westgate Park | 520 |
| 14 | 15 | Women's Peace Gardens | 121 |
Creating shapefile to match monitoring locations
The insect and butterfly data have only the names of the monitoring locations as geographical information. We will create point geometries with coordinates taken from an online map. Including the correct coordinate reference system (CRS) is essential for the points to be in the right place.
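Why the CRS matters: the buffers drawn later are specified in metres, and a degree of longitude or latitude is not a metre. A back-of-envelope stdlib sketch (assuming a spherical Earth of radius 6,371 km, which is an approximation) shows the scale of a degree at Melbourne's latitude, and hence why buffering must happen in a metre-based projected CRS such as EPSG:3111:

```python
import math

# Approximate ground distance of one degree at Melbourne's latitude (~ -37.81),
# assuming a spherical Earth of radius 6,371 km.
R = 6_371_000  # metres
lat = math.radians(-37.81)

metres_per_deg_lat = math.pi * R / 180
metres_per_deg_lon = metres_per_deg_lat * math.cos(lat)

print(round(metres_per_deg_lat))  # roughly 111 km per degree of latitude
print(round(metres_per_deg_lon))  # roughly 88 km per degree of longitude here
```

A 50 m buffer expressed in geographic degrees would therefore be a tiny, latitude-dependent fraction of a degree, which is why the points are built directly in a projected CRS.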
#Creating point data shapefile to align with the monitoring locations. Coordinates taken from external mapping software (QGIS).
insectCoords = pd.DataFrame({'longitude': [2496990.9139893204, 2497716.4106032336, 2497468.0019974816,2497904.2864259947,2495048.0358508285,2494903.373709257,2496730.7911691167,2497631.1602672394,2495215.229639408,2496593.58352767,
2495657.343150738,2496862.1697277585,2496501.887571214,2491904.086409784,2493312.0341030643],
'latitude': [2410917.9145240453, 2411981.946331717, 2410514.362637444,2409637.309870564,2411319.224455721,2413521.833072339,2410956.1398554766,2411196.477423501,2411624.2424233076,2412883.3243095265,
2412807.994624026,2410110.0260665533,2411092.4507149556,2407712.4794691135,2412201.7700121445]})
insectCounts['geometry'] = gpd.points_from_xy(insectCoords.longitude, insectCoords.latitude, crs="EPSG:3111")
insectCounts
| | id | Location | Miscellaneous Insect Count | geometry |
|---|---|---|---|---|
| 0 | 1 | Argyle Square | 105 | POINT (2496990.914 2410917.915) |
| 1 | 2 | Canning/Neil Street Reserve | 15 | POINT (2497716.411 2411981.946) |
| 2 | 3 | Carlton Gardens South | 273 | POINT (2497468.002 2410514.363) |
| 3 | 4 | Fitzroy-Treasury Gardens | 557 | POINT (2497904.286 2409637.310) |
| 4 | 5 | Gardiner Reserve | 51 | POINT (2495048.036 2411319.224) |
| 5 | 6 | Garrard Street Reserve | 33 | POINT (2494903.374 2413521.833) |
| 6 | 7 | Lincoln Square | 79 | POINT (2496730.791 2410956.140) |
| 7 | 8 | Murchinson Square | 22 | POINT (2497631.160 2411196.477) |
| 8 | 9 | Pleasance Gardens | 76 | POINT (2495215.230 2411624.242) |
| 9 | 10 | Princes Park | 446 | POINT (2496593.584 2412883.324) |
| 10 | 11 | Royal Park | 1249 | POINT (2495657.343 2412807.995) |
| 11 | 12 | State Library of Victoria | 105 | POINT (2496862.170 2410110.026) |
| 12 | 13 | University Square | 43 | POINT (2496501.888 2411092.451) |
| 13 | 14 | Westgate Park | 520 | POINT (2491904.086 2407712.479) |
| 14 | 15 | Women's Peace Gardens | 121 | POINT (2493312.034 2412201.770) |
Creating Final dataset
Now we will combine the three datasets to determine the number of insects counted at each location in relation to the area of canopy cover found within 50, 100 and 200 m of the monitoring locations.
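The buffer-and-intersect logic boils down to asking "how much canopy area falls within r metres of a point?". A stdlib Monte Carlo sketch of that idea, on toy geometry (a square canopy patch and a circular buffer; the real analysis uses GeoPandas buffers and overlays instead):

```python
import math
import random

random.seed(42)

def area_within_buffer(patch, centre, radius, n=200_000):
    """Monte Carlo estimate of the canopy-patch area inside a circular buffer.

    patch: (xmin, ymin, xmax, ymax) axis-aligned rectangular canopy, in metres.
    Samples points uniformly over the patch and scales the hit fraction by its area.
    """
    xmin, ymin, xmax, ymax = patch
    patch_area = (xmax - xmin) * (ymax - ymin)
    cx, cy = centre
    hits = 0
    for _ in range(n):
        x = random.uniform(xmin, xmax)
        y = random.uniform(ymin, ymax)
        if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
            hits += 1
    return patch_area * hits / n

# A 200 m x 200 m canopy patch centred on the monitoring point: the 50 m buffer
# lies wholly inside it, so the estimate should be close to pi * 50^2 (~7854 m^2).
est = area_within_buffer((-100, -100, 100, 100), (0, 0), 50)
print(round(est))
```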
#Adding butterfly data to insect data.
insectCounts = pd.merge(left=insectCounts, right=bflyCounts, how='left', left_on='id', right_on='id')
insectCounts = insectCounts.drop(['site'], axis=1)
#Summing counts of misc. insects and butterflies.
insectCounts['Total Insect Count'] = insectCounts["Miscellaneous Insect Count"] + insectCounts["Butterfly Species Count"]
#Converting dataframe to geopandas format.
insectCounts = gpd.GeoDataFrame(insectCounts, geometry='geometry')
insectCounts
| | id | Location | Miscellaneous Insect Count | geometry | Butterfly Species Count | Total Insect Count |
|---|---|---|---|---|---|---|
| 0 | 1 | Argyle Square | 105 | POINT (2496990.914 2410917.915) | 256 | 361 |
| 1 | 2 | Canning/Neil Street Reserve | 15 | POINT (2497716.411 2411981.946) | 40 | 55 |
| 2 | 3 | Carlton Gardens South | 273 | POINT (2497468.002 2410514.363) | 592 | 865 |
| 3 | 4 | Fitzroy-Treasury Gardens | 557 | POINT (2497904.286 2409637.310) | 572 | 1129 |
| 4 | 5 | Gardiner Reserve | 51 | POINT (2495048.036 2411319.224) | 28 | 79 |
| 5 | 6 | Garrard Street Reserve | 33 | POINT (2494903.374 2413521.833) | 48 | 81 |
| 6 | 7 | Lincoln Square | 79 | POINT (2496730.791 2410956.140) | 124 | 203 |
| 7 | 8 | Murchinson Square | 22 | POINT (2497631.160 2411196.477) | 52 | 74 |
| 8 | 9 | Pleasance Gardens | 76 | POINT (2495215.230 2411624.242) | 56 | 132 |
| 9 | 10 | Princes Park | 446 | POINT (2496593.584 2412883.324) | 224 | 670 |
| 10 | 11 | Royal Park | 1249 | POINT (2495657.343 2412807.995) | 1152 | 2401 |
| 11 | 12 | State Library of Victoria | 105 | POINT (2496862.170 2410110.026) | 84 | 189 |
| 12 | 13 | University Square | 43 | POINT (2496501.888 2411092.451) | 24 | 67 |
| 13 | 14 | Westgate Park | 520 | POINT (2491904.086 2407712.479) | 588 | 1108 |
| 14 | 15 | Women's Peace Gardens | 121 | POINT (2493312.034 2412201.770) | 220 | 341 |
#Creating circles (buffers) of 50m, 100m and 200m radius around the insect monitoring locations. These will be combined with the tree canopy data to determine the relationship between canopy and insect variety.
insectBuffer50 = insectCounts.buffer(50)
insectBuffer100 = insectCounts.buffer(100)
insectBuffer200 = insectCounts.buffer(200)
insectBuffer50 = gpd.GeoDataFrame(geometry=gpd.GeoSeries(insectBuffer50))
insectBuffer100 = gpd.GeoDataFrame(geometry=gpd.GeoSeries(insectBuffer100))
insectBuffer200 = gpd.GeoDataFrame(geometry=gpd.GeoSeries(insectBuffer200))
insectBuffer50.insert(0, 'id', range(1, 1 + len(insectBuffer50)))
insectBuffer100.insert(0, 'id', range(1, 1 + len(insectBuffer100)))
insectBuffer200.insert(0, 'id', range(1, 1 + len(insectBuffer200)))
#Creating areas of canopy around insect monitoring locations. Note: you may need to install rtree to enable the overlay operation.
insectCanopies50 = canopies.to_crs(3577).overlay(insectBuffer50.to_crs(3577), how='intersection')
insectCanopies100 = canopies.to_crs(3577).overlay(insectBuffer100.to_crs(3577), how='intersection')
insectCanopies200 = canopies.to_crs(3577).overlay(insectBuffer200.to_crs(3577), how='intersection')
print(insectCanopies50)
id_1 id_2 geometry
0 104130 14 POLYGON ((1141446.045 -4193561.856, 1141445.92...
1 104087 14 POLYGON ((1141435.462 -4193555.236, 1141435.44...
2 104138 14 MULTIPOLYGON (((1141446.591 -4193561.924, 1141...
3 104322 14 POLYGON ((1141410.164 -4193573.187, 1141410.03...
4 104304 14 POLYGON ((1141414.561 -4193572.480, 1141414.43...
.. ... ... ...
491 73820 4 POLYGON ((1147641.480 -4192318.236, 1147641.60...
492 73796 4 POLYGON ((1147642.071 -4192317.558, 1147641.94...
493 73641 4 POLYGON ((1147614.847 -4192308.184, 1147614.72...
494 73615 4 POLYGON ((1147614.221 -4192307.105, 1147614.09...
495 74827 4 POLYGON ((1147586.901 -4192352.051, 1147587.21...
[496 rows x 3 columns]
#Calculating the area of canopy cover and adding it as a variable.
insectCanopies50['area'] = insectCanopies50['geometry'].area
insectCanopies100['area'] = insectCanopies100['geometry'].area
insectCanopies200['area'] = insectCanopies200['geometry'].area
#Summing canopy area by insect study location.
insectCanopies50 = insectCanopies50.groupby(["id_2"], as_index=False)["area"].sum()
insectCanopies100 = insectCanopies100.groupby(["id_2"], as_index=False)["area"].sum()
insectCanopies200 = insectCanopies200.groupby(["id_2"], as_index=False)["area"].sum()
insectCanopies50 = insectCanopies50.rename({'id_2': 'id',"area": "Canopy_Area_within_50m"}, axis=1)
insectCanopies100 = insectCanopies100.rename({'id_2': 'id',"area": "Canopy_Area_within_100m"}, axis=1)
insectCanopies200 = insectCanopies200.rename({'id_2': 'id',"area": "Canopy_Area_within_200m"}, axis=1)
#Adding the calculated areas to the insect dataset.
insectCounts = pd.merge(left=insectCounts, right=insectCanopies50, how='left', left_on='id', right_on='id')
insectCounts = pd.merge(left=insectCounts, right=insectCanopies100, how='left', left_on='id', right_on='id')
insectCounts = pd.merge(left=insectCounts, right=insectCanopies200, how='left', left_on='id', right_on='id')
#Viewing the final insect dataset.
insectCounts
| | id | Location | Miscellaneous Insect Count | geometry | Butterfly Species Count | Total Insect Count | Canopy_Area_within_50m | Canopy_Area_within_100m | Canopy_Area_within_200m |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Argyle Square | 105 | POINT (2496990.914 2410917.915) | 256 | 361 | 4337.761239 | 11708.661318 | 19431.520072 |
| 1 | 2 | Canning/Neil Street Reserve | 15 | POINT (2497716.411 2411981.946) | 40 | 55 | 1718.869834 | 3274.540483 | 8310.062987 |
| 2 | 3 | Carlton Gardens South | 273 | POINT (2497468.002 2410514.363) | 592 | 865 | 6593.746242 | 18617.315930 | 46600.421530 |
| 3 | 4 | Fitzroy-Treasury Gardens | 557 | POINT (2497904.286 2409637.310) | 572 | 1129 | 5041.864217 | 16344.909060 | 49381.679929 |
| 4 | 5 | Gardiner Reserve | 51 | POINT (2495048.036 2411319.224) | 28 | 79 | 4387.109784 | 12071.034682 | 30428.417613 |
| 5 | 6 | Garrard Street Reserve | 33 | POINT (2494903.374 2413521.833) | 48 | 81 | 1638.678259 | 4009.880939 | 16333.862172 |
| 6 | 7 | Lincoln Square | 79 | POINT (2496730.791 2410956.140) | 124 | 203 | 3196.918871 | 7475.327496 | 12903.552974 |
| 7 | 8 | Murchinson Square | 22 | POINT (2497631.160 2411196.477) | 52 | 74 | 1858.354637 | 3426.253100 | 26320.635387 |
| 8 | 9 | Pleasance Gardens | 76 | POINT (2495215.230 2411624.242) | 56 | 132 | 2812.159284 | 7391.812237 | 25713.525523 |
| 9 | 10 | Princes Park | 446 | POINT (2496593.584 2412883.324) | 224 | 670 | 2470.257730 | 5664.470683 | 25053.417166 |
| 10 | 11 | Royal Park | 1249 | POINT (2495657.343 2412807.995) | 1152 | 2401 | 5081.587581 | 20608.872287 | 62154.569734 |
| 11 | 12 | State Library of Victoria | 105 | POINT (2496862.170 2410110.026) | 84 | 189 | 2321.042462 | 3800.754053 | 9519.126325 |
| 12 | 13 | University Square | 43 | POINT (2496501.888 2411092.451) | 24 | 67 | 827.727272 | 1271.522227 | 10851.672239 |
| 13 | 14 | Westgate Park | 520 | POINT (2491904.086 2407712.479) | 588 | 1108 | 919.884951 | 3973.459726 | 18654.878985 |
| 14 | 15 | Women's Peace Gardens | 121 | POINT (2493312.034 2412201.770) | 220 | 341 | 2330.300969 | 8792.687481 | 33695.272141 |
Graphing the data
Now that we've combined our insect and tree canopy datasets, we can compare the results to view the relationship between number of insect species and tree canopy coverage.
from scipy.stats import pearsonr
from matplotlib.pyplot import figure
m, b = np.polyfit(insectCounts['Canopy_Area_within_50m'], insectCounts['Total Insect Count'], 1) #Line of best fit.
corr, _= pearsonr(insectCounts['Canopy_Area_within_50m'],insectCounts['Total Insect Count'])
figure(figsize=(8, 6), dpi=80)
plt.plot(insectCounts['Canopy_Area_within_50m'], insectCounts['Total Insect Count'], 'o')
plt.plot(insectCounts['Canopy_Area_within_50m'], m*insectCounts['Canopy_Area_within_50m']+ b)
plt.title("Count of Different Insect Species vs Tree Canopy Area Within 50 Metres of Monitoring Location", y=1.1, fontsize=15)
plt.xlabel(r"Area of Tree Canopy Coverage $\mathregular{m^{2}}$")
plt.ylabel("Number of Insect Species Present")
print()
plt.show()
print("50m Pearson's correlation =", corr)
print()
50m Pearson's correlation = 0.47805394899273107
m, b = np.polyfit(insectCounts['Canopy_Area_within_100m'], insectCounts['Total Insect Count'], 1) #Line of best fit.
corr, _= pearsonr(insectCounts['Canopy_Area_within_100m'],insectCounts['Total Insect Count'])
figure(figsize=(8, 6), dpi=80)
plt.plot(insectCounts['Canopy_Area_within_100m'], insectCounts['Total Insect Count'], 'o')
plt.plot(insectCounts['Canopy_Area_within_100m'], m*insectCounts['Canopy_Area_within_100m']+ b)
plt.title("Count of Different Insect Species vs Tree Canopy Area Within 100 Metres of Monitoring Location", y=1.1, fontsize=15)
plt.xlabel(r"Area of Tree Canopy Coverage $\mathregular{m^{2}}$")
plt.ylabel("Number of Insect Species Present")
print()
plt.show()
print("100m Pearson's correlation =", corr)
print()
100m Pearson's correlation = 0.6973749996557823
m, b = np.polyfit(insectCounts['Canopy_Area_within_200m'], insectCounts['Total Insect Count'], 1) #Line of best fit.
corr, _= pearsonr(insectCounts['Canopy_Area_within_200m'],insectCounts['Total Insect Count'])
figure(figsize=(8, 6), dpi=80)
plt.plot(insectCounts['Canopy_Area_within_200m'], insectCounts['Total Insect Count'], 'o')
plt.plot(insectCounts['Canopy_Area_within_200m'], m*insectCounts['Canopy_Area_within_200m']+ b)
plt.title("Count of Different Insect Species vs Tree Canopy Area Within 200 Metres of Monitoring Location", y=1.1, fontsize=15)
plt.xlabel(r"Area of Tree Canopy Coverage $\mathregular{m^{2}}$")
plt.ylabel("Number of Insect Species Present")
print()
plt.show()
print("200m Pearson's correlation =", corr)
print()
200m Pearson's correlation = 0.777741424908217
Results
As expected, the graphs show a clear positive relationship between tree canopy coverage and the number of insect species present at a location. In our analysis, the relationship becomes stronger as a wider area of canopy is considered: Pearson's correlation between insect species count and canopy coverage rose from 0.48 when considering canopy up to 50m from the monitoring site to 0.78 when considering canopy up to 200m.
0.78 indicates a strong linear relationship.
0.48 still indicates a moderate relationship, but the result is less decisive.
Intuitively it makes sense to consider the wider area, as many flying insects have a range of at least 200m. Some travel thousands of kilometres!
These results reinforce the need for green walls at locations with little canopy coverage.
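The three near-identical correlation cells above can be collapsed into a single loop over the buffer radii. A minimal sketch, using a small hypothetical stand-in for the `insectCounts` columns built earlier (with the real dataframe, the loop reproduces the three values reported above):

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical stand-in for a few rows of the merged insectCounts dataframe.
df = pd.DataFrame({
    "Total Insect Count": [361, 55, 865, 1129, 2401],
    "Canopy_Area_within_50m": [4338, 1719, 6594, 5042, 5082],
    "Canopy_Area_within_100m": [11709, 3275, 18617, 16345, 20609],
    "Canopy_Area_within_200m": [19432, 8310, 46600, 49382, 62155],
})

correlations = {}
for radius in (50, 100, 200):
    col = f"Canopy_Area_within_{radius}m"
    corr, _ = pearsonr(df[col], df["Total Insect Count"])
    correlations[radius] = corr
    print(f"{radius}m Pearson's correlation = {corr:.3f}")
```

The same loop could also drive the three scatter plots, keeping radius-specific details (column name, title, printed label) in one place.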
Tree Canopy Area
First we'll prepare the canopy data for the map, to highlight the areas with less vegetation. The canopy data currently covers a much larger area than our area of interest, the CBD, so we'll create a polygon and trim the data to the CBD boundary.
#Creating CBD polygon.
from shapely.geometry import Polygon
lon_point_list = [144.944077, 144.970996, 144.975450, 144.948066]
lat_point_list = [-37.813788,-37.807214,-37.815545,-37.822770]
cbd_geom = Polygon(zip(lon_point_list, lat_point_list))
cbdBoundary = gpd.GeoDataFrame(index=[0], crs='EPSG:4326', geometry=[cbd_geom])
#Clipping canopy data to only the Melbourne CBD.
cbdCanopy = canopies.to_crs(4326).overlay(cbdBoundary, how='intersection')
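As a quick sanity check on the hand-drawn boundary, we can confirm the polygon is valid and contains a known central landmark — a minimal sketch using only shapely, where the Melbourne Town Hall coordinate is an approximate value assumed for illustration:

```python
from shapely.geometry import Point, Polygon

# Same four corners as the CBD boundary above (lon, lat order).
lon_point_list = [144.944077, 144.970996, 144.975450, 144.948066]
lat_point_list = [-37.813788, -37.807214, -37.815545, -37.822770]
cbd_geom = Polygon(zip(lon_point_list, lat_point_list))

# Approximate Melbourne Town Hall coordinate (assumed for illustration).
town_hall = Point(144.9666, -37.8152)
print(cbd_geom.is_valid, cbd_geom.contains(town_hall))
```

If a point we expect to be central falls outside, the corner coordinates likely need adjusting before the clip.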
Pedestrian Traffic
Here we visualise the average hourly pedestrian traffic via scaled blue circles; larger circles indicate busier locations.
# URL fragments placed before and after the lat/long to build a link
# that opens the location's Google Street View in a new tab.
linkPart1 = "<br><a href=\"https://www.google.com/maps/@?api=1&map_action=pano&viewpoint="
linkPart2 = "&source=outdoor\" target=\"_blank\">Street View</a>"
# Colour map based on PM10 values.
colormap = cm.LinearColormap(colors=['green','yellow','red'], index=[20, 25, 30], vmin=20, vmax=30,
caption='Average PM10 Sensor readings')
# Create (f)igure and base (m)ap with style and zoom level.
f = folium.Figure(width=1200, height=700)
m = folium.Map(location=[-37.81368709240999, 144.95738102347036], tiles = 'CartoDB positron', zoom_start=15, width=1200, height=700)
# Feature Group for pedestrian count layer.
pc = folium.FeatureGroup(name="Pedestrian Count")
# Add sensor locations, popup information (ID, Avg ped count, Street View) and increase size with value.
for _, locat in pedest.iterrows():
pc.add_child(
folium.CircleMarker(
location = [locat['latitude'], locat['longitude']],
popup = ("ID: " + str(locat['sensor_id']) + "<br>People: " + str(locat['hourly_counts_mean']) +
linkPart1 + str(locat['latitude']) + "," + str(locat['longitude']) + linkPart2),
radius = locat['hourly_counts_mean'] / 25,
fill = True
)
)
PM10 Readings
The following code visualises the paired 24-hour averaged sensor data we obtained above. Circle colour (green to red) represents the sensor's average reading.
NOTE:
You may need to adjust your location in the Street View tab by clicking the nearest street in the bottom left-hand map to avoid viewing user-generated Google Maps images.
# Feature Group for Particulate Matter 10 layer.
pm = folium.FeatureGroup(name="Particulate Matter 10")
# Add sensor locations, popup information (ID, PM10 value, Street View) and colour based on their mean value to Feature Group.
for _, locat in pm10.iterrows():
pm.add_child(
folium.CircleMarker(
location = [locat['latitude'], locat['longitude']],
popup = ("ID: " + str(locat['site_id']) + "<br>Average PM10 Concentration: " + str(round(locat['value_mean'], 2)) +
linkPart1 + str(locat['latitude']) + "," + str(locat['longitude']) + linkPart2),
color = colormap(locat['value_mean']),
fill = True,
fill_opacity = 0.4
)
)
#Adding the canopy data.
folium.Choropleth(geo_data=cbdCanopy).add_to(m)
# Add mean pedestrian count feature group to map.
m.add_child(pc)
# Add particulate matter 10 feature group to map.
m.add_child(pm)
# Add Legend for PM colour levels.
colormap.add_to(m)
# Add layer control to map.
m.add_child(folium.LayerControl())
#Add map to figure
m.add_to(f)
f